Ribosomal profiling of human endogenous retroviruses in healthy tissues

Human endogenous retroviruses (HERVs) are the germline embedded proviral fragments of ancient retroviral infections that make up roughly 8% of the human genome. Our understanding of HERVs in physiology primarily surrounds their non-coding functions, while their protein coding capacity remains virtually uncharacterized. Therefore, we applied the bioinformatic pipeline “hervQuant” to high-resolution ribosomal profiling of healthy tissues to provide a comprehensive overview of translationally active HERVs. We find that HERVs account for 0.1–0.4% of all translation in distinct tissue-specific profiles. Collectively, our study further supports claims that HERVs are actively translated throughout healthy tissues to provide sequences of retroviral origin to the human proteome.

Supplementary Information
The online version contains supplementary material available at 10.1186/s12864-023-09909-x.


Background
Human endogenous retroviruses (HERVs) persist within the genome as the legacy of ancient retroviral infections that integrated into the germline [1,2]. Germline-embedded retroviruses are then transmitted vertically and, over time, accumulate mutations or deletions that prohibit infectious particle formation. Once a retrovirus no longer produces infectious particles, it is deemed "endogenous" [1]. Endogenization is not an instantaneous process, but instead occurs through complex transgenerational invasion of genomic sequences by a retrovirus, as demonstrated by the active endogenization events occurring in koalas [3]. Once a retrovirus has invaded the germline of a host species, endogenization can be driven by a multitude of factors, such as xenotropic restriction [4], mutations [1,2], host antiviral responses [5], and recombination events [6]. Collectively, HERVs make up about 8% of human genetic material [2,7,8] and have therefore substantially impacted the genome. While HERVs are mostly inactive [9] and, unlike the ERVs of other mammals, none are replication competent [10], many do display spatiotemporal activity in somatic [11][12][13][14] and developing cells [15][16][17][18][19][20][21] alike. Since their endogenization, many HERV elements have been co-opted to accomplish molecular tasks that are observable throughout reproduction [22,23], immune responses [24,25], and cell type-specific transcription [11,17,19,26]. Our current understanding of HERVs is primarily derived from their genomic and transcriptomic functions, while little is known about their protein-encoding capabilities.
Here, we performed the first large-scale characterization of HERV translation in healthy tissues by analyzing publicly available ribosomal profiling (RiboSeq) datasets [27]. RiboSeq quantifies the translatome by sequencing the short fragments (~25-35 nucleotides) of ribosome-protected RNA, thereby providing a 'snapshot' of protein production [28]. By applying the bioinformatic pipeline 'hervQuant' [29] to these data, we quantify the translational abundance of over 3000 annotated HERV proviruses [30] across an atlas of healthy tissue and cell types by aligning ribosome-protected short RNA sequencing fragments to full-length proviruses. Collectively, this approach provides the first comprehensive characterization of actively translated HERV proviruses under healthy conditions. We term the collective of HERV proteins undergoing translation the "endoretrotranslatome" (ERT) and suggest further investigation into the ERT as an understudied component of human health.
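To make the two abundance measures used in this study concrete, the following minimal Python sketch shows how per-provirus reads per million (RPM) and the HERV-aligned share of all filtered reads could be computed from raw counts. It is an illustration only and is not taken from the hervQuant pipeline: the function names, the provirus labels, the example counts, and the choice of total filtered reads as the RPM denominator are all assumptions made for this example.

```python
from typing import Dict


def reads_per_million(counts: Dict[str, int], total_filtered_reads: int) -> Dict[str, float]:
    """Scale raw per-provirus read counts to reads per million (RPM).

    Assumption: the denominator is the total number of filtered reads in the
    sample; the source does not state the exact denominator used.
    """
    if total_filtered_reads <= 0:
        raise ValueError("total_filtered_reads must be positive")
    scale = 1_000_000 / total_filtered_reads
    return {provirus: count * scale for provirus, count in counts.items()}


def herv_percentage(counts: Dict[str, int], total_filtered_reads: int) -> float:
    """Return HERV-aligned reads as a percentage of all filtered reads."""
    if total_filtered_reads <= 0:
        raise ValueError("total_filtered_reads must be positive")
    return 100.0 * sum(counts.values()) / total_filtered_reads


# Hypothetical counts for one sample (labels and numbers are illustrative).
example_counts = {"HERV_A": 1500, "HERV_B": 500, "HERV_C": 0}
example_total = 1_000_000

rpm = reads_per_million(example_counts, example_total)
pct = herv_percentage(example_counts, example_total)
```

With these made-up numbers, 2000 of 1,000,000 filtered reads align to HERV proviruses, which corresponds to 0.2% of the sample and falls within the range reported in this study.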
Our analyses demonstrate that HERV-provirus aligned reads make up a surprising portion of the human translatome, encompassing roughly 0.1-0.4% of all translation in a site-specific manner. Unsurprisingly, the ERT displays substantial diversity across tissue sites. As the expression of HERVs at the RNA level is tightly regulated by an extensive complement of epigenetic modifications [9], their translation with little interindividual variation suggests that their expression at the protein level is likely by design and not inadvertent. Post-translation, HERV protein stability and function may be rapidly compromised by the host via post-translational modifications [81] or by the targeted clearance of dysfunctional protein aggregates [82,83]; a limitation of this study is therefore that the half-life of these proteins is unknown. For example, our results show that paraneoplastic Ma antigen 1 (PNMA1), a neuronal autoantigen that contains a domesticated LTR retrotransposon capsid and is associated with paraneoplastic neurological pathologies [84], is translated throughout all tissue types tested (Fig. S5). Therefore, going forward, jointly considering the rates of transcription, translation, and degradation would provide the most comprehensive determination of HERV activity [85].

Conclusions
In this study, we demonstrate that HERVs, acquired via ancient retroviral infections, are translationally active elements. Previous misconceptions suggested that HERVs were merely inert or parasitic sequences; however, it is now appreciated that HERVs influence host physiology [86], regulate transcriptional networks [87,88], contribute to the transcriptome [11][12][13], and provide retroviral motifs that propagate immunity [24,25]. Here, we demonstrate that HERVs are translated in greater than anticipated proportions, and that HERV proteins are a reservoir of poorly defined macromolecules that may impact human health and disease. Previous studies have shown that a diverse profile of HERVs is expressed at the RNA level throughout various tissue sites, and that HERV RNAs make up roughly 0.19-1.91% of all polyadenylated RNA in a site-specific manner [12]. Additionally, the authors of that study demonstrated that HERV RNA activity is sensitive to confounding variables, such as background and age [12]. Transcriptional activity of the HML and HERVHF superfamilies, which we found to be most abundant in the ERT, has previously been detected in fully differentiated somatic tissues [12,13,89,90]. Additionally, in embryonic stem cells (ESCs) many HERV elements are derepressed, and HERVH elements are highly active and contribute to ESC-specific cellular processes [16,91]. Therefore, it is unsurprising that ESCs show the highest proportions of HERV translation, both globally and from the HERVHF family.
In accordance with previous observations of HERV activity in the transcriptome and genome, we now demonstrate that HERV RNAs can be found in the ribosomes of healthy human tissues. While ribosome-associated RNA content does not perfectly equate to stable protein levels, as demonstrated by the translational abundance of PNMA1, which is absent from the protein content of healthy cells [92], it does suggest that HERV elements participate in the intricacies of cellular biology more than previously considered. We emphasize that future studies are of the utmost importance to investigate the translational efficiency and stability of HERV proteins, and to determine whether the pre- or post-translational modifications that contribute to their clearance go awry in diseases associated with HERV protein abundance. Continued characterization of the ERT will provide valuable insight into the mechanisms by which ancient retroviral genes underlie cellular processes as potentially viable and unstudied protein-coding genes. These results also suggest a reassessment of previous nomenclature that may have classified HERVs as non-coding genes, given that they are translated, albeit at low abundance in the translatome.

Data and code availability
All original code utilized for this study can be found at https://github.com/nixonlab/te_riboseq_atlas. The code for quantifying HERV-provirus aligning reads was adapted from the previously developed hervQuant pipeline [29], which can be found at https://unclineberger.org/vincentlab/resources/. Post hoc visualization of HERV provirus loci was performed with the Integrative Genomics Viewer (IGV) [93] desktop application, available at https://software.broadinstitute.org/software/igv/. Scatter plots and heatmaps were generated with GraphPad Prism version 9.3.1, available at https://www.graphpad.com/scientific-software/prism/. Biplots displaying PCA differentiation of samples were generated using PCAtools, available at https://github.com/kevinblighe/PCAtools.
Fig. 1 Ribosomal profiling reveals active translation of HERV proviruses in healthy tissue and cell types. a Schematic overview of the workflow for profiling HERV proviral abundances from RiboSeq data. b HERV-aligned reads as a percentage of all filtered sequencing reads per sample. Dots indicate individual biological replicates with the graphed mean. Error bars indicate ± standard error of the mean (SEM). c Total number of HERV proviruses possessing ≥ 1 RPM per sample. Dots indicate individual biological replicates with the graphed mean. Error bars indicate ± SEM. d PCA plot of all tissue and cell types based on HERV-aligned ribosomal profiling reads alone. e Individual sample RPM abundances of all HERV proviruses per sample, clustered per cell or tissue type. HERVs are listed in descending order by average RPM abundance. f Average RPM abundances of all HERV proviruses per cell or tissue type. HERVs are listed in descending order by average RPM abundance.
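The summary statistics described for panel c can be illustrated with a short Python sketch: counting the proviruses at or above 1 RPM in each replicate, then reporting the mean and standard error of the mean (SEM) across replicates. This is a hedged example rather than the code used to produce the figure; the function names, provirus labels, and replicate values are hypothetical, and the SEM is assumed to be the sample standard deviation (n − 1 denominator) divided by the square root of n.

```python
import math
from typing import Dict, List, Tuple


def count_detected(rpm_by_provirus: Dict[str, float], threshold: float = 1.0) -> int:
    """Count proviruses whose RPM is at or above the detection threshold."""
    return sum(1 for value in rpm_by_provirus.values() if value >= threshold)


def mean_and_sem(values: List[float]) -> Tuple[float, float]:
    """Return the mean and the standard error of the mean.

    Assumption: SEM = sample standard deviation (n - 1 denominator) / sqrt(n).
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed to estimate the SEM")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


# Hypothetical RPM values for three biological replicates of one tissue.
replicates = [
    {"HERV_A": 12.0, "HERV_B": 0.4, "HERV_C": 1.0, "HERV_D": 0.0},
    {"HERV_A": 9.5, "HERV_B": 2.1, "HERV_C": 1.3, "HERV_D": 5.0},
    {"HERV_A": 8.0, "HERV_B": 1.5, "HERV_C": 0.9, "HERV_D": 3.0},
]

detected_counts = [count_detected(rep) for rep in replicates]
mean_detected, sem_detected = mean_and_sem([float(c) for c in detected_counts])
```

Each entry of `detected_counts` corresponds to one dot in a panel such as Fig. 1c, while `mean_detected` and `sem_detected` correspond to the graphed mean and its error bar.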